[UI] Sequence Extraction Quickstart (GPT-3.5-Turbo)
Last updated: October 6th, 2024
This Quickstart provides an end-to-end walkthrough of how to run Sequence Extraction tests with DynamoEval. It also covers general guidelines and specific examples for setting up test configurations.
If you are a developer and wish to follow the same steps with Dynamo AI’s SDK, please refer to the associated SDK Quickstart.
Create Model
Begin by navigating to the Dynamo home page. This page contains the model registry – a collection of all the models you have uploaded for DynamoEval or DynamoGuard. The model registry contains information to help you identify your model, such as the model source, use case, and date updated.
For the Sequence Extraction test on 10K Reports data, follow the instructions below to add a fine-tuned GPT-3.5 model to your model registry:
- To upload a new model to the registry, click the Upload new model button.
- A popup will appear, requesting information such as Model name and Model source. *Remote inference can be used to connect to any model that is provided by a third party or is already hosted and accessible through an API endpoint. Local inference can be used for a custom model file or a HuggingFace Hub ID.*
Example. For this quickstart, we recommend setting the following:
- Model name: GPT 3.5 10K Reports
- Model Source: Remote Inference
The next page of the popup will ask for more detailed information about the model provider, API key, model identifier, as well as an optional model endpoint (if required by your API provider).
Example. For this quickstart, we recommend setting the following:
- API Provider: OpenAI
- API Key: OpenAI API key provided to you
- Model: gpt-3.5-turbo
- Endpoint: leave blank
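Optionally, before registering the model, you can verify that the API key and model identifier are valid. Below is a minimal sanity check using the official `openai` Python client (v1.x); the placeholder key is an assumption, and this step is not part of DynamoEval itself.

```python
# Optional sanity check (not part of DynamoEval): confirm the API key and
# model identifier work before registering the model in the UI.
from openai import OpenAI

client = OpenAI(api_key="YOUR_OPENAI_API_KEY")  # placeholder: the key provided to you

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # same identifier entered in the Model field
    messages=[{"role": "user", "content": "ping"}],
    max_tokens=5,
)
print(response.choices[0].message.content)
```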
At this point, your model should be created and displayed in the model registry.
Click on the DynamoEval link on the right, then navigate to Testing > New Test to start creating a test for this model.
Create Test
- Fill in the test title to be indicative of the test you are running.
- Select Privacy Tests
- Select Test Type: Sequence Extraction.
1. Select an Existing Dataset
After selecting the test type, you’ll be asked to select a dataset. Here, you can select an existing dataset that has been previously uploaded by clicking the checkbox next to its name. If you are using the platform for the first time, skip to the next section.
2. OR Upload a New Dataset
Alternatively, you can upload a new dataset by clicking “Upload custom dataset”. On the pop-up sidebar, you’ll be asked to provide a dataset name and description. We recommend being specific so you can clearly identify the dataset in the future.
Next, identify an access level. Finally, you’ll be asked to upload a dataset. Currently, Dynamo AI supports running attacks and evaluations on CSV datasets (Local Dataset) or datasets hosted on the HuggingFace Hub (HuggingFace Dataset).
- For HuggingFace Dataset (left), you will be asked to fill in the Dataset ID and the access token, which is required if the dataset is private. A quick way to confirm these values resolve is shown below.
- For Local Dataset (right), you will be asked to drag and drop the CSV file.
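If you plan to use a private HuggingFace dataset, you can confirm that the Dataset ID and access token resolve before entering them in the UI. A minimal sketch using the `datasets` library (recent versions accept a `token` argument; the dataset ID and token below are placeholders):

```python
# Minimal check that a (possibly private) HuggingFace dataset is reachable.
# Requires `pip install datasets`; the ID and token are placeholders.
from datasets import load_dataset

ds = load_dataset(
    "your-org/your-dataset-id",    # Dataset ID you will enter in the UI
    token="hf_your_access_token",  # required only for private datasets
)
print(ds)
```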
Example. For this quickstart, we recommend setting the following:
- Dataset name: 10K Reports Finetuning Dataset
- Description / Access: leave as default
- Dataset Type: Local Dataset
- Uploaded file: 10k_dataset.csv
After uploading, you will be asked to select (via checkbox) the dataset you just created. Hitting “Next” will bring you to the Dataset configuration page. Make sure the Dataset Column Name is set to prompt.
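Before uploading, it can also help to confirm that your CSV contains the columns this quickstart references (prompt for the dataset configuration, plus text and title for the test parameters below). A minimal sketch using pandas; whether all three columns are required depends on your dataset, so adjust as needed:

```python
# Quick pre-upload check that the CSV has the columns this quickstart uses.
import pandas as pd

df = pd.read_csv("10k_dataset.csv")
expected = {"prompt", "text", "title"}  # columns referenced in this quickstart
missing = expected - set(df.columns)
if missing:
    raise ValueError(f"CSV is missing expected columns: {missing}")
print(f"{len(df)} rows, columns: {list(df.columns)}")
```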
Test Parameters Setup
This page allows you to vary different test parameters to observe performance across different settings. For DynamoEval Sequence Extraction tests, you can vary the following parameters (a conceptual sketch of how they interact follows the recommended values below):
- Temperature: controls the randomness of the model’s generations during decoding; lower values produce more deterministic outputs
- Sequence length: controls the number of tokens generated by the model when performing the sequence extraction attack
- Sampling rate: controls the number of queries made to the model to estimate vulnerability to sequence leakage
- Memorization Granularity: either "paragraph" or "sentence"; controls the level of granularity at which memorization is measured for each sample
- Is Fine-tuned: select true if you are using a fine-tuned model and would like to see whether it memorized the contents of the fine-tuning dataset
For this quickstart, we recommend setting the values as specified below:
Dataset:
- Text Column: text
- Title Column: title
Test Hyperparameters:
- Temperature: 0
- Is Fine-tuned: false
- Sampling rate: 500
- Sequence length: 500
- Prompt length: 128
- Memorization Granularity: paragraph
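To make these parameters concrete, here is a conceptual sketch of what a sequence extraction attack measures: prompt the model with the first prompt-length tokens of each sampled document and check how much of the true continuation it reproduces. This is an illustration of the idea only, not DynamoEval’s implementation; the tokenizer choice, prompt construction, and naive prefix-match scoring are all simplifying assumptions.

```python
# Conceptual illustration of a sequence extraction attack (NOT DynamoEval's
# actual implementation). Requires `pip install openai tiktoken pandas`.
import pandas as pd
import tiktoken
from openai import OpenAI

client = OpenAI(api_key="YOUR_OPENAI_API_KEY")  # placeholder key
enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

PROMPT_LENGTH = 128     # tokens shown to the model (Prompt length)
SEQUENCE_LENGTH = 500   # tokens generated per query (Sequence length)
SAMPLING_RATE = 500     # number of documents queried (Sampling rate)
TEMPERATURE = 0         # deterministic decoding (Temperature)

df = pd.read_csv("10k_dataset.csv").head(SAMPLING_RATE)

for text in df["text"]:
    tokens = enc.encode(text)
    if len(tokens) <= PROMPT_LENGTH:
        continue  # document too short to split into prompt + continuation
    prompt = enc.decode(tokens[:PROMPT_LENGTH])
    true_continuation = tokens[PROMPT_LENGTH:PROMPT_LENGTH + SEQUENCE_LENGTH]

    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=SEQUENCE_LENGTH,
        temperature=TEMPERATURE,
    )
    generated = enc.encode(response.choices[0].message.content)

    # Naive overlap score: count how many tokens of the true continuation
    # are reproduced verbatim from the start (real tests use more nuanced
    # matching at paragraph or sentence granularity).
    match = 0
    for a, b in zip(generated, true_continuation):
        if a != b:
            break
        match += 1
    print(f"reproduced {match}/{len(true_continuation)} tokens")
```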
Verifying Test Summary
Finally, verify that your test summary matches the settings above before queueing the test.
Checking Results
After queueing the test, you will see one of three status indicators on the model’s Testing tab: Complete, In Progress, or Awaiting Resources.
Once the test is marked Complete, you can look through the test results in three different ways:
- Dashboard: In the Dashboard tab, examine key metrics such as # Sequence Extracted, Precision, and Recall.
- Deep-dive: Under the Testing tab, click “View Test Details” for the Sequence Extraction Attack section to examine the results for each inference from the model.
- See report: Under the Testing tab, click the drop-down arrow on the right of “Sequence Extraction Attack”, then click “Download report” to view the generated Sequence Extraction report.
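For intuition on the Precision and Recall metrics, one plausible token-overlap formulation is sketched below; these definitions are illustrative assumptions and may not match DynamoEval’s exact computation.

```python
# One plausible token-level scoring of an extracted sequence (illustrative
# only; DynamoEval's exact metric definitions may differ).
from collections import Counter

def precision_recall(generated_tokens, reference_tokens):
    """Precision: share of generated tokens that appear in the reference.
    Recall: share of reference tokens recovered by the generation."""
    overlap = sum((Counter(generated_tokens) & Counter(reference_tokens)).values())
    precision = overlap / len(generated_tokens) if generated_tokens else 0.0
    recall = overlap / len(reference_tokens) if reference_tokens else 0.0
    return precision, recall

p, r = precision_recall("the cat sat on".split(), "the cat sat on the mat".split())
print(f"precision={p:.2f}, recall={r:.2f}")  # precision=1.00, recall=0.67
```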